Muchnik's theorem about simple conditional descriptions states
that for all strings a and b there exists a program p transforming a to
b that has the least possible length and is simple conditional on b. In
this paper we present two new proofs of this theorem. The first one
is based on the on-line matching algorithm for bipartite graphs. The
second one, based on extractors, can be generalized to prove a version
of Muchnik's theorem for space-bounded Kolmogorov complexity. Another
version of Muchnik's theorem is proven for a resource-bounded
variant of Kolmogorov complexity based on Arthur-Merlin protocols.

1 Muchnik's Theorem

In this section we recall a result about conditional Kolmogorov complexity
due to An. Muchnik [7]. By C(u) we denote the Kolmogorov complexity of a
string u, i.e., the length of a shortest program generating u. The conditional
complexity of u given v, the length of a shortest program that translates v
to u, is denoted by C(u|v), see [1].

Theorem 1. Let a and b be two binary strings, C(a) < n and C(a|b) < k.
Then there exists a string p such that

• C(a|p,b) ≤ O(log n);

• C(p) ≤ k + O(log n);

• C(p|a) ≤ O(log n).

This is true for all a, b, n, k, and the constants hidden in O(log n) do not
depend on them.

Remarks. 1. In the second inequality we can replace the complexity C(p)
of a string p by its length |p|. Indeed, we can use the shortest description of
p instead of p.

2. We may let k = C(a|b) + 1 and replace k + O(log n) by C(a|b) + O(log n)
in the second inequality. We may also let n = C(a) + 1.

3. Finally, having |p| ≤ C(a|b) + O(log n), we can delete the last O(log n)
bits of p, and the first and third inequalities will remain true. We come
to the following reformulation of Muchnik's theorem: for every two binary
strings a and b there exists a binary string p of length at most C(a|b) such
that C(a|p,b) ≤ O(log C(a)) and C(p|a) ≤ O(log C(a)).

Informally, Muchnik's theorem says that there exists a program p that
transforms b to a, has the minimal possible complexity C(a|b) up to a
logarithmic term, and, moreover, can be easily obtained from a. The last
requirement is crucial, otherwise the statement becomes a trivial
reformulation of the definition of conditional Kolmogorov complexity.

This theorem is an algorithmic counterpart of the Slepian-Wolf theorem [TT]
in multisource information theory. Assume that somebody (S) knows b and
wants to know a. We know a and want to send some message p to S that
will allow S to reconstruct a. How long should this message be? Do we
need to know b to be able to find such a message? Muchnik's theorem
provides a kind of negative answer to the last question, though we still need
logarithmic advice. Indeed, the absolute minimum for the complexity of
a piece of information p that together with b allows S to reconstruct a is
C(a|b). It is easy to see that this minimum can be achieved with logarithmic
precision by a string p that has logarithmic complexity conditional on a and
b. But it turns out that in fact b is not needed and we can provide p that is
simple conditional on a and still does the job.

In many cases statements about Kolmogorov complexity have combinatorial
counterparts (and sometimes it is easy to show the equivalence between
complexity and combinatorial statements). In the present paper we study
two different combinatorial objects closely related to Muchnik's theorem and
its proof.

First (Sect. 2), we define the on-line matching problem for bipartite
graphs. We formulate some combinatorial statement about on-line match- 
ings. This statement (1) easily implies Muchnik's theorem and (2) can be 
proven using the same ideas (slightly modified) that were used by Muchnik 
in his original proof. 

Second (Sect. 3), following [3], we use extractors and their combinatorial
properties. Based on this technique, we give a new proof of Muchnik's
theorem. With this method we prove versions of this theorem for polyno- 
mial space Kolmogorov complexity and also for some very special version of 
polynomial time Kolmogorov complexity. 

This work was presented at the CSR 2009 conference in Novosibirsk, Russia,
on 18-23 August 2009, and the conference version of the paper was
published in the CSR 2009 proceedings by Springer-Verlag. This version of
the paper is slightly rearranged and extended.

2 Muchnik's Theorem and On-line Matchings 

In this section we introduce a combinatorial problem that we call on-line 
matching. It can be considered as an on-line version of the classical matching 
problem. Then we formulate some combinatorial statement about on-line 
matchings and explain how it implies Muchnik's theorem. Finally, we provide 
a proof of this combinatorial statement (starting with the off-line version of 
it) thus finishing the proof of Muchnik's theorem. 

2.1 On-line Matchings 

Consider a bipartite graph with the left part L, the right part R and a set of
edges E ⊂ L × R. Let s be some integer. We are interested in the following
property of the graph:

for any subset L' of L of size at most s there exists a subset
E' ⊂ E that performs a bijection between L' and some R' ⊂ R.

A necessary and sufficient condition for this property is provided by the
well-known Hall's theorem: for each set L' ⊂ L of size t ≤ s the
set of all neighbors of elements of L' must contain at least t elements.

This condition is not sufficient for the following on-line version of
matching. We assume that an adversary gives us elements of L (up to s
elements) one by one. At each step we should provide a counterpart for
the given element x, i.e., choose some neighbor y ∈ R not used
before. This choice is final and cannot be changed later.

Providing a matching on-line, when the next steps of the adversary are not
known in advance, is a more subtle problem than the usual off-line matching.
Now Hall's criterion, while still necessary, is no longer sufficient.
For example, for the graph shown in the picture, one can find a matching for
each subset of size at most 2 of the left part, but this cannot be done on-line.
Indeed, we are blocked if the adversary starts with x.
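
The picture is not reproduced in this text, so the following Python sketch
uses a hypothetical three-vertex graph that has the property described
above; the vertex names x, u, w, r1, r2 and both helper functions are
assumptions made for illustration only. It checks Hall's condition for
subsets of size at most 2 and then verifies that, once the adversary starts
with x, every possible answer can be punished by a second request.

```python
from itertools import combinations

# Hypothetical graph (the paper's picture is not available): x sees both
# right vertices, while u and w each see only one of them.
GRAPH = {"x": {"r1", "r2"}, "u": {"r1"}, "w": {"r2"}}


def hall_condition(graph, s):
    """Check that every left subset of size t <= s has at least t neighbors."""
    for t in range(1, s + 1):
        for subset in combinations(graph, t):
            neighbors = set().union(*(graph[v] for v in subset))
            if len(neighbors) < t:
                return False
    return True


def adversary_wins_after(graph, first):
    """True if, whatever neighbor we assign to `first`, some other left
    vertex has no free neighbor left, so a second request blocks us."""
    for choice in graph[first]:
        blocked = any(graph[v] <= {choice} for v in graph if v != first)
        if not blocked:
            return False
    return True


print(hall_condition(GRAPH, 2))          # off-line matchings of size <= 2 exist
print(adversary_wins_after(GRAPH, "x"))  # but starting with x blocks us
```

In this toy graph the off-line condition holds for s = 2, yet no on-line
strategy survives the request x followed by u or w.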

Now we formulate a combinatorial statement about on-line matching;
then we show that this property implies Muchnik's theorem (Sect. 2.2) and
prove this property (Sect. 2.3).

Combinatorial statement about on-line matchings (OM). There
exists a constant c such that for all integers n and k < n there exists a
bipartite graph E ⊂ L × R whose left part L has size 2^n, right part R has size
2^k n^c, each vertex in L has at most n^c neighbors in R, and for which on-line
matching is possible up to size 2^k.

Note that the size of the on-line matching is close to the size of R up 
to a polynomial factor, and the degrees of all L-elements are polynomially 
bounded, so we are close to Hall's bound. 



2.2 Proof of Muchnik's theorem 

First we show how (OM) implies Muchnik's theorem. We may assume without
loss of generality that the length of the string a (instead of its complexity)
is less than n. Indeed, if we replace a by a shortest program that generates
a, all complexities involving a change by only an O(log n) term: knowing the
shortest program for a, we can get a without any additional information, and
to get a shortest program for a given a we need only to know the value of
C(a), because we can try all programs of length C(a) until one of them
produces a. There may exist several different shortest programs for a; we take
the one that appears first when trying in parallel all programs of length
C(a). As we have said, for similar reasons it does not matter whether we
speak about C(p) or |p| in the conclusion of the theorem. We used C(p) to
make the statement more uniform; however, in the proof we get the bound
for |p| directly.

We may assume that n > k, otherwise the statement of Theorem 1 is
trivial (let p = a). Consider the graph E provided by (OM) with parameters
n and k. Its left part L is interpreted as the set of all strings of length less
than n; therefore, a is an element of L. Knowing b, we can enumerate all
strings x of length less than n such that C(x|b) < k. There exist at most
2^k such strings, and a is one of them. The property (OM) implies that it is
possible to find an on-line matching for all these strings (in the order they
appear during the enumeration). Let p be an element of R that corresponds
to a in this matching.

Let us check that p satisfies all the conditions of Muchnik's theorem. First
of all, note that the graph E can be chosen in such a way that its complexity is
O(log n). Indeed, (OM) guarantees that a graph with the required properties
exists. Given n and k, we can perform an exhaustive search until the first
graph with these properties is found. This graph is a computable function of
n and k, so its complexity does not exceed the complexity of the pair (n, k),
which is O(log n).

If a is given (as well as n and k), then p can be specified by its ordinal
number in the list of a-neighbors. This list contains at most n^c elements, so
the ordinal number contains O(log n) bits.

To specify p without knowing a, we give the ordinal number of p in R,
which is k + O(log n) bits long. Here we again need n and k, but this is
another O(log n) bits.

To reconstruct a from b and p, we enumerate all strings of length less
than n that have conditional complexity (relative to b, which is known) less
than k, and find R-counterparts for them using (OM) until p appears. Then
a is the L-counterpart of p in this matching.

Formally speaking, for given n and k we should fix not only a graph E
but also some on-line matching procedure, and use the same procedure both
for constructing p and for reconstructing a from b and p. □
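
The role of the shared matching procedure can be illustrated by a small
Python sketch. Everything named here (online_match, encode, decode, the toy
graph and the finite list enumeration) is a hypothetical stand-in: the list
replaces the enumeration of strings x with C(x|b) < k, and a plain greedy
rule replaces the procedure whose existence (OM) guarantees. The point is
only that both sides replay the same deterministic procedure.

```python
def online_match(graph, requests):
    """Greedy on-line matching; deterministic, so both parties can replay it.
    Yields (x, y) pairs in arrival order; y is None if x cannot be served."""
    used = set()
    for x in requests:
        y = next((v for v in graph[x] if v not in used), None)
        if y is not None:
            used.add(y)
        yield x, y


def encode(graph, enumeration, a):
    """Return p (the counterpart of a) and its index among a's neighbors."""
    for x, y in online_match(graph, enumeration):
        if x == a:
            if y is None:
                raise ValueError("a was rejected by the toy greedy rule")
            return y, graph[a].index(y)
    raise ValueError("a does not appear in the enumeration")


def decode(graph, enumeration, p):
    """Replay the matching until p is assigned; its left counterpart is a."""
    for x, y in online_match(graph, enumeration):
        if y == p:
            return x
    raise ValueError("p is never assigned")


# Toy data: left vertices are 2-bit strings, right vertices are labels.
graph = {"00": ["p0", "p1"], "01": ["p0", "p2"],
         "10": ["p1", "p2"], "11": ["p0", "p3"]}
enumeration = ["01", "00", "11"]  # order in which strings "appear" given b

p, index = encode(graph, enumeration, "00")
print(p, index, decode(graph, enumeration, p))
```

The index returned by encode corresponds to the O(log n)-bit description of
p given a, and decode corresponds to reconstructing a from b and p.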

2.3 On-line Matchings Exist 

It remains to prove the statement (OM). Our proof follows Muchnik's
original argument adapted for the combinatorial setting.

First, let us prove a weaker statement when on-line matchings are replaced 
by off-line matchings. In this case the statement can be reformulated using 
Hall's criterion, and we get the following statement: 

Off-line version of (OM). There exists a constant c such that for any
integers n and k < n there exists a bipartite graph E ⊂ L × R whose left part
L is of size 2^n, the right part R is of size 2^k n^c, each vertex in L has at
most n^c neighbors in R and for any subset X ⊂ L of size t ≤ 2^k the set N(X)
of all neighbors of all elements of X contains at least t elements.

We prove this statement by probabilistic arguments. We choose at random
(uniformly and independently) n^c neighbors for each vertex l ∈ L. In
this way we obtain a (random) graph where all vertices in L have degree at
most n^c; the degree can be less, as two independent choices for some vertex
may coincide.

We claim that this random graph has the required property with positive
probability. If it does not, there exists a set X ⊂ L of some size t ≤ 2^k and
a set Y of size less than t such that all neighbors of all elements of X belong
to Y. For fixed X and Y the probability of this event is bounded by
(1/n^c)^{t n^c}, since we made t n^c independent choices (n^c times for each
of t elements) and for each choice the probability to get into Y is at most
1/n^c (the set Y covers at most a 1/n^c fraction of points in R).

To bound the probability of violating the required property of the graph,
we multiply the bound above by the number of pairs X, Y. The set X can
be chosen in at most (2^n)^t different ways, since for each of t elements we
have at most 2^n choices; actually the number is smaller since the order of
elements does not matter. For Y we have at most (2^k n^c)^t choices. Further we
sum up these bounds for all t ≤ 2^k. Therefore the total bound is

$$\sum_{t=1}^{2^k} \left(\frac{1}{n^c}\right)^{t n^c} (2^n)^t \, (2^k n^c)^t .$$

This is a geometric series; the sum is less than 1 (which is our goal) if the
base is small. The base is

$$\left(\frac{1}{n^c}\right)^{n^c} (2^n)(2^k n^c) = \frac{2^{n+k}}{n^{c(n^c-1)}},$$

and c = 2 makes it small enough; it even tends to zero as n → ∞. The off-line
version is proven. □
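
As a quick numerical sanity check (not part of the original argument), the
following Python sketch evaluates the logarithm of the base
2^{n+k} / n^{c(n^c - 1)} for c = 2. The function name log2_base is
hypothetical. A geometric series with base below 1/2 sums to less than 1,
so it suffices to see that the logarithm is below -1.

```python
import math


def log2_base(n, k, c=2):
    """log2 of the base 2^(n+k) / n^(c*(n^c - 1)) of the geometric series."""
    return (n + k) - c * (n**c - 1) * math.log2(n)


for n in (2, 4, 8, 16):
    k = n - 1  # the largest k allowed by the condition k < n
    print(n, k, log2_base(n, k))
```

For n = 2 and k = 1 the base is 2^3 / 2^6 = 1/8, and it decreases rapidly
as n grows.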

Now we have to prove (OM) in its original on-line version. Fix a graph
E ⊂ L × R that satisfies the conditions for the off-line version (for given n
and k). Let us use the same graph in the on-line setting with the following
straightforward ("greedy") strategy. When a new element x ∈ L arrives, we
check if it has neighbors that are not used yet. If yes, one of these neighbors
is chosen to be a counterpart of x. If not, x is "rejected".

Before we explain what to do with the rejected elements, let us prove
that at most half of the 2^k given elements can be rejected. Assume that more
than 2^{k-1} elements are rejected. Then fewer than 2^{k-1} elements are served
and therefore fewer than 2^{k-1} elements of R are used as counterparts. But
all neighbors of all rejected elements are used (this is the only reason for
rejection), and we get a contradiction with the condition #N(X) ≥ #X if
X is the set of rejected elements.

Now we need to deal with rejected elements. They are forwarded to the
"next layer" where the new task is to find an on-line matching for 2^{k-1}
elements. If we can do this, then we combine both graphs using the same L and
disjoint right parts R_1 and R_2; the elements rejected at the first layer are
sent to the second one. In other terms: the (n, k) on-line problem is reduced
to the (n, k) off-line problem and the (n, k - 1) on-line problem. The latter
can then be reduced to the (n, k - 1) off-line and (n, k - 2) on-line problems,
etc.

Finally we get k levels. At each level we serve at least half of the requests 
and forward the remaining ones to the next layer. After k levels of filtering 
only one request can be left unserved, so one more layer is enough. Note also 
that we may use copies of the same graph on all layers. 

More precisely, we have proven the following statement: Let E ⊂ L × R
be a graph that satisfies the conditions of the off-line version for given n and
k. Replace each element in R by (k + 1) copies, all connected to the same
elements of L as before. Then the new graph provides on-line matchings up
to size 2^k.

Note that this construction multiplies both the size of R and the degree 
of vertices in L by (k + 1) (a polynomial in n factor). The statement (OM) 
is proven. □ 
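
The layered greedy strategy can be sketched in Python as follows. The names
random_graph and LayeredGreedyMatcher are hypothetical, and the numeric
parameters in the usage lines are toy values chosen for illustration, not
the parameters required by (OM); in particular, the random graph below is
not verified to satisfy the off-line condition.

```python
import random


def random_graph(n_left, n_right, degree, seed=0):
    """Each left vertex gets `degree` random neighbors; repeated choices
    collapse, so the actual degree may be smaller."""
    rng = random.Random(seed)
    return [sorted({rng.randrange(n_right) for _ in range(degree)})
            for _ in range(n_left)]


class LayeredGreedyMatcher:
    """Greedy on-line matching on `layers` disjoint copies of the right part."""

    def __init__(self, graph, layers):
        self.graph = graph
        self.used = [set() for _ in range(layers)]

    def serve(self, x):
        """Return (layer, y) for request x, or None if every layer rejects it."""
        for layer, used in enumerate(self.used):
            for y in self.graph[x]:
                if y not in used:
                    used.add(y)
                    return (layer, y)
        return None


# Toy run: 16 requests (think of k = 4) served with k + 1 = 5 layers.
graph = random_graph(n_left=256, n_right=64, degree=8)
matcher = LayeredGreedyMatcher(graph, layers=5)
requests = random.Random(1).sample(range(256), 16)
answers = [matcher.serve(x) for x in requests]
print(sum(a is not None for a in answers), "of", len(requests), "served")
```

A pair (layer, y) plays the role of a vertex in the graph obtained by
replacing each element of R with k + 1 copies.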

3 Muchnik's Theorem and Extractors 

In this section we present another proof of Muchnik's theorem based on the 
notion of extractors. This technique was first used in a similar situation
in [3]. With this technique we prove some versions of Muchnik's theorem for
resource-bounded Kolmogorov complexity. This result was presented in the
Master's thesis of one of the authors [5].






3.1 Extractors 



Let G be a bipartite graph with N vertices in the left part and M vertices
in the right part. The graph may have multiple edges. Let all vertices of the
left part have the same degree D. Let us fix an integer K > 0 and a real
number ε > 0.

Definition 1. A bipartite graph G is a (K,ε)-extractor if for all subsets S
of its left part such that #S ≥ K and for all subsets Y of the right part the
inequality

$$\left| \frac{\#E(S,Y)}{D \cdot \#S} - \frac{\#Y}{M} \right| < \varepsilon \qquad (1)$$

holds, where E(S,Y) stands for the set of edges between S and Y.

In the sequel we always assume that N, M and D (and sometimes other
quantities denoted by uppercase letters) are powers of 2, and use the
corresponding lowercase letters (n, m, d, etc.) to denote their logarithms. In
this case the extractor may be seen as a function that maps a pair of binary
strings of length n = log N (an index of a vertex on the left) and of length
d = log D (an index of an edge incident to this vertex) to a binary string of
length m = log M (an index of the corresponding vertex on the right).

The extractor property may be reformulated as follows: consider a uniform
distribution on a set S of left-part vertices. The probability of getting
a vertex in Y by taking a random neighbor of a random vertex in S is equal
to #E(S,Y)/(D · #S); this probability must be ε-close to #Y/M, i.e., the
probability of getting a vertex in Y by taking a random vertex in the right
part.

It can be proven that (for an extractor graph) a similar property holds not
only for uniform distributions on S, but for all distributions with min-entropy
at least k = log K (this means that no element of L appears with probability
greater than 1/K). That is, an extractor extracts m almost random bits from
n quasi-random bits (with min-entropy k or more) using d truly random bits.
For a good extractor m should be close to k + d and d should be small (as
well as ε). A standard probabilistic argument shows that for all n, k and ε
extractors with near-optimal parameters m and d do exist:

Theorem 2. For all K, N, M and ε such that 1 < K < N, M > 0, ε > 0,

there exists a (K,ε)-extractor with



D 

So for given n and k we may choose the following values of parameters
(in logarithmic scale):

d = log(n - k) + 2 log(1/ε) + O(1) and m = k + d - 2 log(1/ε) - O(1).

The proof may be found in [I]; it is also shown there that these parameters
are optimal up to an additive term O(log(1/ε)).

So far no explicit constructions of optimal extractors have been invented.
By saying that an extractor is explicit we mean that there exists a family of
extractors for arbitrary values of n and k, the other parameters are computable
in time poly(n), and the extractor itself as a function of two arguments is
computable in poly(n) time. All known explicit constructions are not optimal
in at least one parameter: they either use too many truly random bits, or
do not fully extract randomness (i.e., m ≪ k + d), or do not work for all
values of k. In the sequel we use the following theorem proven in [2]:

Theorem 3. For all 1 < k < n and ε > 1/poly(n) there exists an explicit
(2^k, ε)-extractor for m = k + d and d = O((log n log log n)^2).

For the sake of brevity we use the shorter and slightly weaker bound
O(log^3 n) instead of O((log n log log n)^2) in the sequel.

3.2 The Proof of Muchnik's Theorem 

Now we show how to prove Muchnik's theorem using the extractor technique.
Consider an extractor with some N, K, D, M and ε. Let S be a subset of
its left part such that #S ≤ K. We say that a right-part element is bad for
S if it has more than 2DK/M neighbors in S (twice the expected
value if neighbors in the right part are chosen at random and S has maximal
possible size K), and we say that a left-part element is dangerous in S if all
its neighbors are bad for S.

Lemma 1. The number of dangerous elements in S is less than 2εK.

Proof. We reproduce a simple proof from [3]. Without loss of generality
we may assume that S contains exactly K elements (the sets of bad and
dangerous elements can only increase when S increases).

For any graph the fraction of bad right-part vertices is at most 1/2,
because the degree of a bad vertex is at least twice as large as the average
degree. The extractor property reduces this bound from 1/2 to ε. Indeed,
let δ be the fraction of bad elements in the right part. Then the fraction of
edges going to bad elements (among all edges starting at S) is at least 2δ.
Due to the extractor property, the difference between these fractions should
be less than ε. The inequality δ < ε follows.

Now we count dangerous elements in S. If their fraction in S is 2ε or
more, then the fraction of edges going to the bad elements (among all edges
leaving S) is at least 2ε. But the fraction of bad vertices is less than ε,
and the difference between two fractions should be less than ε due to the
extractor property. □
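
The definitions of bad and dangerous elements can be made concrete with a
short Python sketch. The function bad_and_dangerous and the toy parameters
are hypothetical, and the random graph used below is not claimed to be an
extractor; the sketch only computes the two sets and illustrates the
elementary averaging bound (fewer than M/2 bad vertices) that holds for any
graph when #S ≤ K.

```python
import random


def bad_and_dangerous(neighbors, S, M, K):
    """neighbors[x] lists the D right-part neighbors of left vertex x (with
    multiplicity). Returns the set of right vertices that are bad for S and
    the set of elements of S that are dangerous in S."""
    D = len(neighbors[S[0]])
    load = [0] * M
    for x in S:
        for y in neighbors[x]:
            load[y] += 1
    threshold = 2 * D * K / M
    bad = {y for y in range(M) if load[y] > threshold}
    dangerous = {x for x in S if all(y in bad for y in neighbors[x])}
    return bad, dangerous


# Toy parameters, assumptions for illustration only.
rng = random.Random(0)
N, M, K, D = 1024, 64, 64, 16
neighbors = [[rng.randrange(M) for _ in range(D)] for _ in range(N)]
S = list(range(K))
bad, dangerous = bad_and_dangerous(neighbors, S, M, K)
print(len(bad), len(dangerous))
```

For a genuine (K,ε)-extractor, Lemma 1 strengthens the averaging bound:
the fraction of bad vertices is below ε and fewer than 2εK elements of S
are dangerous.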

Now we present a new proof of Muchnik's theorem. As we have seen
before, we may assume without loss of generality that the length of a is
less than n. Moreover, as we have said, we may assume that the conditional
complexity C(a|b) equals k - 1 (otherwise we decrease k) and that k < n
(otherwise the theorem is obvious, take p = a).

Consider an extractor with given n, k; let d = O(log n), m = k and
ε = 1/n^3; such an extractor exists due to Theorem 2. (The choice of ε will
become clear later.) We choose an extractor whose complexity is at most
2 log n + O(1). It is possible, because only n and k are needed to describe
such an extractor: other parameters are functions of n and k, and we can
search through all bipartite graphs with given parameters in some natural
order until the first extractor with required parameters is found. (This search
requires a very long time, so this extractor is not explicit.)

Now assume that an extractor is fixed. We treat the left part of the
extractor as the set of all binary strings of length less than n (including a),
and the right part as the set of all binary strings of length m = k (we will
choose p among them). Consider the set S_b of all strings in the left part such
that their complexity conditional on b is less than k (a belongs to this set).

We want to apply Lemma 1 to the set S_b and prove that a is not dangerous
in S_b (by showing that otherwise C(a|b) would be too small). So a has a
neighbor p that is not bad for S_b, and this p has the required properties.

According to this plan, let us consider two cases. 

Case 1. If a is not dangerous in S_b, then a has a neighbor p that is not
bad for S_b. Let us show that p satisfies the claim of the theorem.

The complexity of p is at most k + O(1) because its length is k.

The conditional complexity C(p|a) is logarithmic because p is a neighbor of
a in the extractor and to specify p we need a description of the extractor
(2 log n + O(1) bits) and the ordinal number of p among the neighbors of a
(d = log D = O(log n) bits).

As p is not bad for S_b, it has at most 2DK/M = 2D neighbors in S_b (recall
that m = k). If b is known, the set S_b can be enumerated; knowing p, we
select neighbors of p in this enumeration. Thus, to describe a given p and b,
we need only a description of the extractor and the ordinal number of a in the
enumeration of the neighbors of p in S_b, i.e., O(log n) bits in total.

Case 2. Assume that a is dangerous in S_b. Since the set S_b can be
enumerated (given b), the sets of all bad vertices (for S_b) and all dangerous
elements in S_b can also be enumerated. Therefore, a can be specified by the
string b, the extractor and the ordinal number of a in the enumeration of all
dangerous elements in S_b. This ordinal number consists of k - 3 log n + O(1)
bits due to the choice of ε (Lemma 1). So, the full description of a given b
consists of k - log n + O(log log n) bits (here O(log log n) additional bits
are needed for separating n, k and the ordinal number). This contradicts the
assumption that C(a|b) = k - 1. Thus, the second case is impossible and
Muchnik's theorem is proven. □

3.3 Several Conditions and Prefix Extractors 

In [7] An. Muchnik also proved the following generalization of Theorem 1:

Theorem 4. Let a, b and c be binary strings, and let n, k and l be numbers
such that C(a) < n, C(a|b) < k and C(a|c) < l. Then there exist binary
strings p and q of length k and l respectively such that one of them is a
prefix of the other one and all the conditional complexities C(a|p,b),
C(a|q,c), C(p|a), C(q|a) are of order O(log n).

This theorem is quite non-trivial: indeed, it says that information about
a that is missing in b and c can be represented by two strings such that one
is a prefix of the other (though b and c could be totally unrelated). It implies
also that for every three strings a, b, c of length at most n the minimal length
of a program that transforms b to a and at the same time transforms c to a
is at most max{C(a|b), C(a|c)} + O(log n).

In fact a similar statement can be proven not only for two but for many 
(even for poly(n)) conditions. For the sake of brevity we consider only the 
statement with two conditions. 

This theorem also can be proven using extractors. Any extractor can be
viewed as a function E: {0,1}^n × {0,1}^d → {0,1}^m.

Definition 2. We say that a (2^k, ε)-extractor E: {0,1}^n × {0,1}^d → {0,1}^m
is a prefix extractor if for every i < k its prefix of length m - i (a function
E_i: {0,1}^n × {0,1}^d → {0,1}^{m-i} obtained by truncating the last i bits)
is a (2^{k-i}, ε)-extractor.

Using the probabilistic method, the following theorem can be proven:

Theorem 5. For all 1 < k < n and ε > 0 there exists a prefix
(2^k, ε)-extractor with parameters d = log n + 2 log(1/ε) + O(1) and
m = k + d - 2 log(1/ε) - O(1).

Proof. This proof is quite similar to the standard proof of Theorem 2. In
that proof the probabilistic argument is used to show that a random graph
has the required property with positive probability. In fact it is shown that
this probability is not only positive but close to 1. Then we show that the
restriction of a random graph is also a random graph, and the intersection
of several events having probability close to 1 has a positive probability. Let
us explain these arguments in more detail.

We want to show that a random bipartite graph with given parameters
is a prefix extractor with a positive probability. First of all we note that it
is enough to show that inequality (1) holds for S of size exactly K. Then this
condition is true also for every bigger set S, since the uniform distribution
on S is an average of the distributions on its subsets of size K. Second, it is
enough to check the bound only in one direction:

$$\frac{\#E(S,Y)}{D \cdot \#S} - \frac{\#Y}{M} < \varepsilon$$

(for all sets S of cardinality K and for all Y). Indeed, the inequality

$$\frac{\#E(S,Y)}{D \cdot \#S} - \frac{\#Y}{M} > -\varepsilon$$

follows from the previous one applied to the complement of Y: if there are
too few edges from S to Y then there are too many edges from S to the
complement of Y.

Now we specify the distribution on graphs. For every string of length n (a
vertex of the left part) we choose at random (uniformly and independently)
D = 2^d strings of length m (its neighbors in the right part). Now we bound
the probability of the event "a random graph is not a prefix extractor".

If the extractor property is violated for some prefix of length m - i then
there exists a set S of K/2^i elements from the left part and a set
Y ⊂ {0,1}^{m-i} of size α 2^{m-i} (for some α > 0) such that the number of
edges between S and Y is greater than (α + ε)KD/2^i. From the Chernoff bound
it follows that the probability of this event is not greater than
exp(-2ε^2 KD/2^i). Hence, the probability of the event "a random graph is not
a prefix extractor" can be bounded by the sum of such bounds for all i, S,
and Y:



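$$\sum_{i=0}^{k-1} \binom{N}{K/2^i} \, 2^{M/2^i} \, e^{-2\varepsilon^2 KD/2^i}.$$

Since $\binom{u}{v} \le u^v/v! \le (ue/v)^v$ (here e is the base of the
natural logarithm), this sum does not exceed

$$\sum_{i=0}^{k-1} \left(\frac{eN}{K/2^i}\right)^{K/2^i} 2^{M/2^i} \, e^{-2\varepsilon^2 KD/2^i}
= \sum_{i=0}^{k-1} \left( e^{(K/2^i)(1+\ln(2^i N/K))} \, e^{-\varepsilon^2 KD/2^i} \right) \left( e^{M \ln 2/2^i} \, e^{-\varepsilon^2 KD/2^i} \right).$$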

The condition of the theorem implies that D > (M ln 2)/(ε^2 K), assuming that
the O(1) constant is large enough. Hence, the second factor in each term of the
sum is not greater than 1. The first factor, in turn, equals

$$e^{(K/2^i)(1+\ln(2^i N/K)-\varepsilon^2 D)} \le e^{(K/2^i)(1+\ln N-\varepsilon^2 D)},$$

which is less than (1/2)^{K/2^i}, since Dε^2 > 1 + ln 2 + ln N. The sum of
these terms is strictly less than 1. Thus, the probability of the event "a
random graph is a prefix extractor" must be positive. □

However, using prefix extractors is not enough; we need to modify the
argument, since now we need to find two related neighbors in two graphs. So
we modify the notion of a dangerous vertex and use the following analog of
Lemma 1:

Lemma 2. Let us call a left-part element weakly dangerous in S if at least
half of its neighbors are bad for S. Then the number of weakly dangerous
elements in S is at most 4εK.

The proof is similar to the proof of Lemma 1: since only half of all
neighbors are bad, we need twice as many elements. □

Now we give a new proof of Theorem 4 based on prefix extractors. Fix a
prefix extractor E with parameters n, k, d = O(log n), m = k and ε = 1/n^3
(again, we may assume that the complexity of this extractor is
2 log n + O(1)). We also may assume that C(a|b) = k - 1, C(a|c) = l - 1 and
(without loss of generality) k ≥ l.

Let S_b and S_c be the sets of strings of conditional complexity less than k
and l conditional on b and c respectively. Call an element weakly dangerous
in S_b if it is weakly dangerous (in S_b) for the original extractor, and
weakly dangerous in S_c if it is weakly dangerous (in S_c) for the l-bit
prefix of E. Since this prefix E_{k-l} is also an extractor, the statement of
Lemma 2 holds for S_c. The string a belongs to the intersection of S_b and S_c
and is weakly dangerous in neither of them. Hence, a random neighbor of a and
its prefix are not bad for S_b (resp. S_c) with probability greater than 1/2.
So we can find a k-bit string p such that p and its l-bit prefix q are not bad
for S_b and S_c respectively.

They satisfy the requirements. Indeed, the conditional complexities C(p|a)
and C(q|a) are logarithmic because p and q can be specified by their ordinal
numbers among the neighbors of a in the extractor. The string a may be
obtained from p and b with logarithmic advice because p is not bad for S_b in
E; similarly, a can be obtained from q and c with logarithmic advice because
q is not bad for S_c in E_{k-l}. This completes the proof of Muchnik's theorem
for two conditions. □



3.4 Muchnik's Theorem about Space-Bounded Complexity



The arguments from Sect. 3.2 together with constructions of explicit
extractors imply some versions of Muchnik's theorem for resource-bounded
Kolmogorov complexity. In this section we present such a theorem for the 
space-bounded complexity. 

First of all, the definitions. Let φ be a multi-tape Turing machine that
transforms pairs of binary strings to binary strings. The conditional
complexity C_φ^{t,s}(a|b) is the length of the shortest x such that φ(x,b)
produces a in (at most) t steps using space (at most) s. It is known (see [1])
that there exists an optimal description method ψ in the following sense: for
every φ there exists a constant c such that

$$C_\psi^{ct\log t,\,cs}(a|b) \le C_\varphi^{t,s}(a|b) + c.$$

We fix such a method ψ, and in the sequel use the notation C^{t,s} instead of
C_ψ^{t,s}.

Now we present our variant of Muchnik's theorem for space-bounded Kol- 
mogorov complexity: 

Theorem 6. Let a and b be binary strings and n, k and s be numbers such
that C^{∞,s}(a) < n and C^{∞,s}(a|b) < k. Then there exists a binary string p
such that

• C^{∞,O(s)+poly(n)}(a|p,b) = O(log^3 n);

• C^{∞,O(s)}(p) ≤ k + O(log n);

• C^{∞,poly(n)}(p|a) = O(log^3 n),

where all constants in O- and poly-notation depend only on the choice of the
optimal description method.

Proof. The proof of this theorem starts as an effectivization of the argument
of Sect. 3.2. To find p effectively, we use an explicit extractor with
parameters n, k, d = O(log^3 n), m = k and ε = 1/n^3. We increase d and
respectively the conditional complexity of p (when a is given) from O(log n)
to O(log^3 n), because (currently known) explicit extractors use more random
bits than the ideal extractors from Theorem 2. The advantage is that to obtain
p from a we now need only polynomial space (in fact, even polynomial time).

First we prove a weaker version of the theorem assuming that the value 
of s is added as a condition (in three complexities that are bounded by the 
theorem). Later we explain how to get rid of this restriction. 

Recall that a right-part element is bad if it has more than DK/M neighbors
on the left, and a left-part element is dangerous if all its neighbors are
bad. Let us show that if a is not dangerous and p is a neighbor of a that
is not bad, then we can recover a from b and p using $O(\log^3 n)$ extra bits
of information and O(s) + poly(n) space. For any string a' we can test in
O(s) + poly(n) space whether $C^{\infty,s}(a'|b) < k$: we test sequentially all
programs of length less than k and check whether they produce a' in space s
given b. Simulating each such program, we limit its workspace to s and prevent
infinite loops by counting the number of steps. If a program makes more
than $c^s$ steps in space s then it loops; here c is some constant that depends
only on the choice of the universal Turing machine. This counter uses only
O(s) space. Therefore, given b and p we can enumerate all the strings a' that
are neighbors of p and satisfy $C^{\infty,s}(a'|b) < k$, and wait until a string with a
given ordinal number appears.
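
The loop-detection step can be illustrated on a toy model. The following Python sketch is not the universal machine of the proof: the function `run_in_space`, the one-tape binary machine format and the two example machines are hypothetical and chosen only for illustration. It simulates a machine inside a fixed number of tape cells and gives up once the step counter reaches the number of distinct configurations, since a longer run must repeat a configuration and therefore loops.

```python
def run_in_space(delta, space):
    """Simulate a toy one-tape machine on `space` >= 1 blank (0) cells.

    `delta` maps (state, bit) -> (new_state, new_bit, move), move in {-1, 0, 1}.
    The machine starts in state 0 at cell 0 and halts when no rule applies.
    Returns the final tape as a string, or None if the machine leaves the
    allowed cells or provably loops.
    """
    states = {0} | {q for q, _ in delta} | {q for q, _, _ in delta.values()}
    # A configuration is (state, head position, tape contents), so there are
    # at most |states| * space * 2**space of them (a bound of the form c**s).
    max_steps = len(states) * space * 2 ** space
    tape, head, state, steps = [0] * space, 0, 0, 0
    while (state, tape[head]) in delta:
        if steps >= max_steps:
            return None  # more steps than configurations: the machine loops
        state, tape[head], move = delta[(state, tape[head])]
        head += move
        steps += 1
        if not 0 <= head < space:
            return None  # the space bound is exceeded
    return "".join(map(str, tape))


# Hypothetical example machines (assumptions made for this sketch only).
WRITER = {(0, 0): (1, 1, 1), (1, 0): (2, 1, 0)}  # writes two 1's and halts
LOOPER = {(0, 0): (0, 0, 0)}                     # never changes anything

print(run_in_space(WRITER, 2))  # halts within two cells
print(run_in_space(WRITER, 1))  # needs more than one cell
print(run_in_space(LOOPER, 2))  # caught by the step counter
```

The step counter itself takes only O(space) bits, which matches the accounting in the text; the rest of the sketch makes no attempt to be space-efficient.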

The difficulty arises when we try to prove that a is not dangerous. Let us
try to repeat our arguments taking into account the space restrictions. First
we note that one can enumerate (or recognize: for space complexity it is the
same) all bad elements in the right part using space O(s) + poly(n). (As
before, we assume here that s is given in addition to n, k, and b.) Indeed,
bad elements (as defined above) have many neighbors among strings a' such
that $C^{\infty,s}(a'|b) < k$, and those strings can be enumerated.

Therefore, we can also enumerate all dangerous elements in the left part 
using space O(s) + poly(n). We know also that the number of dangerous 
elements is small, but this does not give us a contradiction (as it did before) 
since the space used by this enumeration increases from s to O(s) +poly(n), 
and even a small increase destroys the argument. So we cannot claim that a 
is not dangerous and need to deal somehow with dangerous elements. 



To overcome this difficulty, we use the same argument as in Sect. 2.3. We
treat the dangerous elements on the next layer, with reduced k and another
extractor graph. We need O(k) layers (in fact even O(k/log n) layers) since
by Lemma 1 at every next layer the number of dangerous elements that still
need to be served is reduced at least by the factor $2\varepsilon$. Note also that the
space overhead needed to keep the accounting information is poly(n) and we
never need to run in parallel several computations that require space s (this
space is needed only at the bottom level of the recursion, in all other cases
poly(n) is enough).

So we get the theorem in its weak form (with condition s). For the full
statement some changes are needed. Let us sequentially use space bounds
s' = 1, 2, ...: to enumerate all strings a' such that $C^{\infty,s}(a'|b) < k$, we
sequentially enumerate all strings that can be obtained from b and a k-bit encoding
using space s' = 1, 2, etc. The corresponding set increases as s' increases, and
at some point we enumerate all strings a' such that $C^{\infty,s}(a'|b) < k$ (though
this moment is not known to us). Note that we can avoid multiple copies
of the same string for different values of s': performing the enumeration for
s', we check for every string whether it has appeared earlier (using $s' - 1$
instead of s'). This requires a lot of time, but only O(s) space. Knowing
the ordinal number of a in the entire enumeration, we stop as soon as it is
achieved; hence, the enumeration process requires only space O(s) + poly(n)
(though s is not specified explicitly).
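
The order of this enumeration can be sketched in Python. Everything named here is an assumption made for illustration: `producible(space)` stands for the set of strings obtainable within the given space bound, `toy_producible` is an artificial monotone family, and `max_space` is a safety cap that the argument in the text does not need. The sketch shows only the order of enumeration and the duplicate test against the previous level; it does not reproduce the space bound, since Python sets hold a whole level in memory.

```python
def nth_new_string(producible, ordinal, max_space):
    """Return (string, space) for the `ordinal`-th string listed when the
    space bound runs over 1, 2, ... and repeated strings are skipped."""
    count = 0
    for space in range(1, max_space + 1):
        for x in sorted(producible(space)):
            # Recompute the previous level instead of remembering old output.
            if space > 1 and x in producible(space - 1):
                continue
            count += 1
            if count == ordinal:
                return x, space
    return None  # fewer than `ordinal` strings appear up to max_space


def toy_producible(space):
    """Hypothetical stand-in: the binary numerals of 1..space."""
    return {format(i, "b") for i in range(1, space + 1)}


print(nth_new_string(toy_producible, 3, 10))
```

Stopping at a known ordinal number is what keeps the search bounded even though the right value of the space bound is never given explicitly.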

Similarly, the set of dangerous words a (that go to the second or higher
layer) increases as s' increases, and can be enumerated sequentially for s' =
1, 2, 3, ... without repetitions in O(s') + poly(n) space. Therefore, at every
layer we can use the same argument (enumerating all the elements that reach
this layer and at the same time are neighbors of p, until we produce as many
of them as required). □

Remarks. 1. The process of enumerating a' such that $C^{\infty,s'}(a'|b) < k$
sequentially for s' = 1, 2, 3, ... can be considered as the enumeration of all a'
such that C(a'|b) < k. So we just get the proof for the unrestricted version
of Muchnik's theorem with an additional remark: if an explicit extractor is
used, then the short programs provided by this theorem require only slightly
more space than the programs given in the condition.

2. When we use several layers (instead of a contradiction with the
assumption that the complexity C(a|b) is exactly $k - 1$) we in fact do not need
to use small $\varepsilon$ (like $1/n^3$ that we have used in our argument); a small constant
value of $\varepsilon$ is enough.

3.5 Muchnik's Theorem for CAM-complexity 

The arguments from the previous sections cannot be applied to Kolmogorov
complexity with a polynomial time bound. Roughly speaking, the obstacle
is that we cannot implement an exhaustive search over the list of
'bad' strings in polynomial time unless P = NP. The best result that we can
prove for poly-time bounded complexity involves a version of Kolmogorov
complexity introduced in [9]:

Definition 3. Let $U_n$ be a non-deterministic universal Turing machine.
Arthur-Merlin complexity $\mathrm{CAM}^t(x|y)$ is the length of a shortest string p such
that

1. $\mathrm{Prob}_r[U_n(y,p,r)$ accepts, and all accepting paths return $x] > 2/3$;

2. $U_n(y,p,r)$ stops in at most time t (for all branches of the non-deterministic
computation).

As always, $\mathrm{CAM}^t(x) := \mathrm{CAM}^t(x|\Lambda)$.

This definition is typically used for t = poly(|x|). Intuitively, a
CAM-description p of a string x (given another string y) is an interactive
Arthur-Merlin protocol: Arthur himself can do probabilistic polynomial
computations, and can ask questions to the all-powerful but not trustworthy Merlin;
Merlin can do any computations and provide Arthur with any requested certificate.
So, Arthur should ask questions such that the certificates returned by Merlin
can be effectively used to generate x. With this version of resource-bounded
Kolmogorov complexity we have a variant of Muchnik's theorem:

Theorem 7. For every polynomial $t_1$ there exists a polynomial $t_2$ such that
the following condition holds. For all strings a, b of length at most n such
that $C^{t_1(n),\infty}(a|b) < k$, there exists a string p of length $k + O(\log^3 n)$ such that

• $C^{t_2(n),\infty}(p|a) = O(\log^3 n)$ and

• $\mathrm{CAM}^{t_2(n)}(a|b,p) = O(\log^3 n)$.

Proof: In the proof of this theorem we cannot use an arbitrary effective
extractor. We rely essentially on properties of one particular extractor
constructed by L. Trevisan [10]. Our arguments mostly repeat the proof of
Theorem 3 from [9].

First of all we recall the definition of the Trevisan extractor, which is
based on the technique from the seminal paper by Nisan and Wigderson [12].
The first crucial ingredient of the Trevisan function is a weak design. A
system of sets

\[ S_1, \dots, S_m \subset \{1, \dots, d\} \]

is called a weak design with parameters (l, d) if each $S_i$ consists of l elements
and for every i > 1 the sum $\sum_{j=1}^{i-1} 2^{\#(S_i \cap S_j)}$ is bounded by $(m - 1)$. Weak
designs exist; moreover, they can be constructed effectively. More precisely,
there exists an algorithm that for any given l, m generates a weak design with
$d = O(l^2 \log m)$ in time polynomial in l and m, see [12].

Let us fix a weak design as above. For $x \in \{0,1\}^d$ we use the following
notation: $x|_{S_i}$ denotes the l-bit string that is obtained by projecting x onto
the coordinates specified by $S_i$.
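
The weak-design condition and the projection can be checked mechanically on small instances. The Python sketch below is only an illustration: the function names and the tiny design `SMALL_DESIGN` (m = 4, l = 2, d = 6) are assumptions chosen here, and nothing in it reflects the efficient construction cited above.

```python
def is_weak_design(sets, l, d):
    """Check that S_1..S_m (sets of 1-based coordinates) form a weak design:
    every set has l elements from {1..d}, and for each i > 1 the sum over
    j < i of 2**|S_i & S_j| is at most m - 1."""
    m = len(sets)
    universe = set(range(1, d + 1))
    if any(len(S) != l or not S <= universe for S in sets):
        return False
    for i in range(1, m):  # 0-based index of the set S_{i+1}
        total = sum(2 ** len(sets[i] & sets[j]) for j in range(i))
        if total > m - 1:
            return False
    return True


def project(x, S):
    """Return the bits of the string x at the coordinates in S (1-based),
    taken in increasing order of the coordinate."""
    return "".join(x[c - 1] for c in sorted(S))


# A hypothetical small example with m = 4, l = 2, d = 6.
SMALL_DESIGN = [{1, 2}, {1, 3}, {2, 4}, {5, 6}]

print(is_weak_design(SMALL_DESIGN, 2, 6))
print(project("101100", {2, 4}))
```

In this example the later sets have less slack: the last set must be disjoint from all earlier ones, because the sum for i = m already has m - 1 terms, each at least 1.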

The second important ingredient of Trevisan's construction is an error
correcting code. For every positive integer n and $\delta > 0$, there exists a list
decodable code

\[ \mathrm{LDC}_{n,\delta} : \{0,1\}^n \to \{0,1\}^{\bar n}, \]

where $\bar n = \mathrm{poly}(n/\delta)$, such that

1. $\mathrm{LDC}_{n,\delta}(x)$ can be computed in polynomial time;

2. given any $y \in \{0,1\}^{\bar n}$, the list of all $x \in \{0,1\}^n$ such that
$\bar x = \mathrm{LDC}_{n,\delta}(x)$ and y agree in at least a $(1/2 + \delta)$ fraction of bits can be
generated in time $\mathrm{poly}(n/\delta)$. In particular, this property means that
the number of words x in this list is not greater than $\mathrm{poly}(n/\delta)$

(see, e.g., [13]). In the sequel we will assume that $\bar n$ is a power of 2.

Let us fix an encoding as above and denote $l(n) = \log \bar n$. For $u \in \{0,1\}^n$
the value $\mathrm{LDC}_{n,\delta}(u)$ is a string of length $2^l$. So, we can view $\mathrm{LDC}_{n,\delta}(u)$ as a
Boolean function

\[ \bar u : \{0,1\}^l \to \{0,1\}. \]

Having fixed a weak design $S_1, \dots, S_m$ and an encoding $\mathrm{LDC}_{n,\delta}$, we define
the Trevisan function $\mathrm{TR}_\delta : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ as

\[ \mathrm{TR}_\delta(u,y) = \bar u(y|_{S_1}) \dots \bar u(y|_{S_m}). \]
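
To make the definition concrete, here is a toy Python sketch of the Trevisan function. It rests on a loudly stated assumption: the list decodable code is replaced by the Hadamard code (the bit at position z is the inner product of u and z modulo 2). That code has length $2^n$, so it does not satisfy the requirement $\bar n = \mathrm{poly}(n/\delta)$ and serves here only as a stand-in; with it, l = n, so every set of the design must have n elements. The function names and the small design are hypothetical.

```python
def hadamard_bit(u, z):
    """Bit number z of the Hadamard codeword of u: inner product mod 2.
    (Stand-in for the Boolean function u-bar; an assumption of this sketch.)"""
    return sum(int(a) & int(b) for a, b in zip(u, z)) % 2


def trevisan(u, y, design):
    """Toy Trevisan function: for each set S_i of the design, project the
    seed y onto S_i and output the codeword bit of u at that position."""
    out = []
    for S in design:
        z = "".join(y[c - 1] for c in sorted(S))  # y restricted to S_i
        out.append(str(hadamard_bit(u, z)))
    return "".join(out)


# Hypothetical parameters: n = l = 2, d = 6, m = 4.
SMALL_DESIGN = [{1, 2}, {1, 3}, {2, 4}, {5, 6}]

print(trevisan("11", "101100", SMALL_DESIGN))
```

Each output bit reads the same codeword at a different position, and the positions overlap only as much as the sets of the design do; this limited overlap is what the proof below exploits.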

We do not need to show that TR is an extractor (for suitable values of
n, d, m); in our proof we refer directly to the definition of this function. We
will use the Trevisan function for $\delta = 1/(8m)$ and m = k + d + 2. More precisely,
the parameters are chosen as follows. Numbers k and n are taken from the
statement of the theorem; $l(n) = \log \bar n$ is obtained from the construction
of $\mathrm{LDC}_{n,\delta}$; it remains to choose an appropriate m and $d = O(l^2 \log m) =
O(\log^3 n)$ so that (i) there exists a weak design with parameters m, l, d, and
(ii) the equation m = k + d + 2 holds true.

Denote by $L_b$ the set of all strings whose time-bounded complexity
conditional on b is not greater than k:

\[ L_b = \{u \in \{0,1\}^n \mid C^{t_1(n),\infty}(u|b) \le k\} \]

(obviously $a \in L_b$ and $\#L_b < 2^{k+1}$). We have chosen m so that the
TR-image of $L_b$ (i.e., the set of values $\mathrm{TR}_\delta(u,p)$ for some $u \in L_b$ and $p \in \{0,1\}^d$)
covers at most 50% of the set $\{0,1\}^m$. Denote by B the predicate "being in
the TR-image of $L_b$". Trivially, for every $u \in L_b$

\[ \mathrm{Prob}_{r_1 \dots r_d}[B(\mathrm{TR}_\delta(u, r_1 \dots r_d)) = 1] - \mathrm{Prob}_{r_1 \dots r_m}[B(r_1 \dots r_m) = 1] > 1/2 \]

(the first probability is equal to 1 and the second one is not greater than
1/2). In other notation, we have

\[ \mathrm{Prob}_{y \in \{0,1\}^d}[B(\bar u(y|_{S_1}) \bar u(y|_{S_2}) \dots \bar u(y|_{S_m})) = 1]
 - \mathrm{Prob}_{r_1 \dots r_m}[B(r_1 \dots r_m) = 1] > 1/2. \]

We apply the standard 'hybridization' trick: we note that for some i

\[ \mathrm{Prob}_{y, r_{i+1}, \dots, r_m}[B(\bar u(y|_{S_1}) \bar u(y|_{S_2}) \dots \bar u(y|_{S_i}) r_{i+1} \dots r_m) = 1]
 - \mathrm{Prob}_{y, r_i, r_{i+1}, \dots, r_m}[B(\bar u(y|_{S_1}) \bar u(y|_{S_2}) \dots \bar u(y|_{S_{i-1}}) r_i \dots r_m) = 1] > 1/(2m) \qquad (2) \]
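
Two routine steps are used implicitly here, and it may help to see them written out. The display below is a sketch; the hybrid probabilities $H_j$ are notation introduced only for this illustration. The first line counts the TR-image of $L_b$ using $\#L_b < 2^{k+1}$ and m = k + d + 2; the remaining lines are the telescoping sum behind inequality (2).

```latex
% Size of the TR-image of L_b:
\[
\#\{\mathrm{TR}_\delta(u,y) : u \in L_b,\ y \in \{0,1\}^d\}
\;\le\; \#L_b \cdot 2^d \;<\; 2^{k+1+d} \;=\; 2^{m-1} \;=\; \tfrac12 \cdot 2^m .
\]
% Hybrid distributions: the first j bits come from TR, the rest are random.
\[
H_j \;=\; \mathrm{Prob}_{y,\,r_{j+1},\dots,r_m}
\bigl[B\bigl(\bar u(y|_{S_1}) \dots \bar u(y|_{S_j})\, r_{j+1} \dots r_m\bigr) = 1\bigr],
\qquad j = 0, 1, \dots, m .
\]
% H_m and H_0 are the two probabilities compared before (2):
\[
\sum_{i=1}^{m} \bigl(H_i - H_{i-1}\bigr) \;=\; H_m - H_0 \;>\; \tfrac12
\quad\Longrightarrow\quad
H_i - H_{i-1} \;>\; \tfrac{1}{2m} \ \text{ for some } i .
\]
```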

Further, we can fix the bits of y outside of $S_i$ so that (2) remains
true. Denote $y|_{S_i}$ by x. Now each function $\bar u(y|_{S_j})$ depends on only $\#(S_j \cap S_i)$ bits
of x (the other bits of y are fixed). Thus, every $\bar u(y|_{S_j})$ can be considered
as a function $\bar u_j(x)$ with a truth table of size $2^{\#(S_j \cap S_i)}$. It follows that all
functions $\bar u(y|_{S_1}), \dots, \bar u(y|_{S_{i-1}})$ together can be specified by

\[ \sum_{j=1}^{i-1} 2^{\#(S_j \cap S_i)} \le m - 1 \]

bits (the inequality follows from the definition of weak designs). This
argument holds for every $u \in \{0,1\}^n$. We are interested in the case u = a,
where a is the string from the statement of the theorem. We denote by p the
concatenation of the truth tables of $\bar u(y|_{S_1}), \dots, \bar u(y|_{S_{i-1}})$ for u = a (so its
length is less than m). To specify this p given a, we need to know only m, i
and the bits of y outside of $S_i$. Hence, $C^{\mathrm{poly}(n),\infty}(p|a) = O(\log^3 n)$.

In the rest of the proof we show that there exists an Arthur-Merlin
protocol that reconstructs a given b, p and some small additional information.
Since u = a, it is enough to reconstruct the string $\bar u$ (then we apply the
decoding procedure and find $u = \mathrm{LDC}^{-1}_{n,\delta}(\bar u)$).

Let us investigate the inequality (2). To make the notation more concise,
we denote $F(x, r_i \dots r_m) = B(\bar u_1(x) \dots \bar u_{i-1}(x) r_i \dots r_m)$, and

\[ g_{r_i}(x, r_{i+1} \dots r_m) = \begin{cases} r_i, & \text{if } F(x, r_i \dots r_m) = 1, \\ 1 - r_i, & \text{otherwise.} \end{cases} \]

Straightforward calculations imply

\[ \mathrm{Prob}_{x, r_i \dots r_m}[\bar u(x) = g_{r_i}(x, r_{i+1} \dots r_m)] > 1/2 + 1/(2m) \qquad (3) \]

(This is a standard argument from the computational XOR Lemma, see [11].)
Now we fix a value of $r_i$ (set it to 0 or 1) so that inequality (3) remains true.
This bit must be included into the description of u given b and p. W.l.o.g.
we assume that $r_i = 1$, and in the sequel we omit $r_i$ in our notation. If the
word p defined above and a "typical" sequence $r_{i+1} \dots r_m$ are given, Arthur
can approximate $\bar u$ and then reconstruct u (using the decoding algorithm for
$\mathrm{LDC}_{n,\delta}$). So, Arthur chooses at random several copies of $r_{i+1} \dots r_m$ and tries
to approximate $\bar u$ with each copy. Below we explain how this works.
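
The "straightforward calculations" behind inequality (3) can be spelled out as follows. This is a sketch; the quantities $A_1$ and $A_0$ are notation introduced only for this illustration, and all probabilities are taken with the bits of y outside of $S_i$ already fixed.

```latex
% A_1: the i-th bit is the correct value; A_0: the i-th bit is flipped.
\[
A_1 = \mathrm{Prob}_{x,\,r_{i+1} \dots r_m}\bigl[F(x, \bar u(x)\, r_{i+1} \dots r_m) = 1\bigr],
\qquad
A_0 = \mathrm{Prob}_{x,\,r_{i+1} \dots r_m}\bigl[F(x, (1 - \bar u(x))\, r_{i+1} \dots r_m) = 1\bigr].
\]
% A uniformly random r_i equals \bar u(x) with probability 1/2, so (2) reads
\[
A_1 - \frac{A_1 + A_0}{2} \;=\; \frac{A_1 - A_0}{2} \;>\; \frac{1}{2m} .
\]
% The guess g_{r_i} is correct if r_i = \bar u(x) and F = 1,
% or if r_i \ne \bar u(x) and F \ne 1:
\[
\mathrm{Prob}_{x,\,r_i \dots r_m}\bigl[\bar u(x) = g_{r_i}(x, r_{i+1} \dots r_m)\bigr]
\;=\; \tfrac12 A_1 + \tfrac12 (1 - A_0)
\;=\; \tfrac12 + \frac{A_1 - A_0}{2}
\;>\; \tfrac12 + \frac{1}{2m} .
\]
```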

We say that $r \in \{0,1\}^{m-i}$ provides an $\alpha$-approximation of $\bar u$ if
$\mathrm{Prob}_x[g(x,r) = \bar u(x)] \ge \alpha$. For every fixed r we identify the function $g_r(x) := g(x,r)$ with
the string $z_r$ of length $2^l = \bar n$ whose x-th bit equals 1 iff g(x,r) = 1.
So, the number of 1's in $z_r$ is equal to the number of strings x such that
$B(\bar u_1(x) \dots \bar u_{i-1}(x) 1 r) = 1$.

Observation: If B(w) = 1 for some string w, Merlin can provide a
certificate for this fact. Indeed, he (i) communicates to Arthur some u', y such that
$\mathrm{TR}_\delta(u', y) = w$, and (ii) provides a poly-time program p' of length at most k
such that p'(b) stops in $t_1(n)$ steps and returns u' (i.e., Merlin proves to Arthur
that $u' \in L_b$).

We say that a string $v \in \{0,1\}^{\bar n}$ is a candidate if at least 1/(32m) of all
$r \in \{0,1\}^{m-i}$ provide a (1/2 + 1/(8m))-approximation for v. From the decoding
property of the code $\mathrm{LDC}_{n,\delta}$, each $z \in \{0,1\}^{\bar n}$ can be an approximation for
at most q = poly(m) different codewords $\mathrm{LDC}_{n,\delta}(u)$. Hence, there exist at
most 32mq candidates among the codewords (of course, $\bar u$ is a candidate). By
Sipser's CD-coding theorem [15] there exists a poly-time program p'' of length
2 log(32mq) = O(log n) that accepts $\bar u$ and rejects all other candidates (no
guarantee about non-candidates: p'' may accept or reject any of them).
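
The bound 32mq is a double-counting argument, sketched below under the reading that candidates are counted among codewords; N is notation introduced here for their number. Each candidate is approximated by at least a 1/(32m) fraction of the $2^{m-i}$ strings r, while each r approximates at most q codewords.

```latex
\[
N \cdot \frac{2^{m-i}}{32m}
\;\le\;
\#\bigl\{(v, r) : v \text{ is a candidate codeword and } r \text{ provides a }
(1/2 + 1/(8m))\text{-approximation for } v \bigr\}
\;\le\; q \cdot 2^{m-i},
\]
\[
\text{hence}\quad N \;\le\; 32mq .
\]
```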

First part of the Arthur-Merlin protocol: Denote

\[ \bar g = \sum_{x,r} g(x,r) / 2^{m-i}. \]

This is the average number of strings $x \in \{0,1\}^l$ such that g(x, 1r) = 1 for a
random $r \in \{0,1\}^{m-i}$.

At first Arthur chooses s random strings r(1), ..., r(s) of length $(m - i)$
(a polynomial s = s(n) will be specified below). He asks Merlin to generate
$s \cdot (\bar g - \gamma)$ certificates ($\gamma = \gamma(n)$ is specified in what follows) for the facts
that different tuples (x, 1r(j)) satisfy g(x, 1r(j)) = 1, and verifies these
certificates. If at least one certificate is false, Arthur stops without any answer.
If the certificates are OK, Arthur calculates $z'_1, \dots, z'_s$, where the x-th bit of $z'_j$
is 1 iff Merlin provided a certificate of the fact that g(x, 1r(j)) = 1.

We need the following probabilistic lemma:

Lemma 3. For some rational $\gamma = n/\mathrm{poly}(m)$ and integer s = poly(n), with
probability at least 2/3 Merlin can provide $s \cdot (\bar g - \gamma)$ certificates corresponding
to random r(1), ..., r(s), and (whatever certificates are chosen by Merlin) the
following two conditions hold:

• At least s/(16m) of the strings $z'_{r(1)}, \dots, z'_{r(s)}$ (corresponding to the
certificates given by Merlin) provide some (1/2 + 1/(8m))-approximation to
$\bar u$.

• For every v, if at least 1/(16m) of the strings $z'_{r(1)}, \dots, z'_{r(s)}$ provide some
(1/2 + 1/(8m))-approximation to v, then v is a candidate.

Proof: see Claims 17 and 18 in [9].

In our Arthur-Merlin protocol we use the parameters s and $\gamma$ from
Lemma 3.

Second part of the Arthur-Merlin protocol. Arthur no longer needs
to communicate with Merlin. Now he composes the list of all
codewords v that are (1/2 + 1/(8m))-close (coincide on a fraction at least
(1/2 + 1/(8m)) of bits) to at least s/(16m) of the strings $z'_1, \dots, z'_s$. From Lemma 3
it follows that with probability at least 2/3 all strings in this list are
candidates, and the string $\bar u$ is included in the list. The program p'' defined above
can distinguish $\bar u$ from the other strings in the list.

Thus, Arthur can find $\bar u$ in polynomial time if he is given b, p and the
following additional information: the index i, the bit $r_i$, the mean value $\bar g$
of positive values of B, and the distinguishing program p''. In fact, it is
enough to know not the exact value of $\bar g$ but only an approximation to this
number; this approximation must be precise enough so that Arthur can find
the integer part of $s\bar g$. Thus, the required additional information contains
only O(log n) bits. Now it is not hard to check that the described protocol
of generating a satisfies the definition of CAM-complexity.

Let us summarize the argument. The CAM-program for a consists of (i)
the truth tables of $\bar u(y|_{S_1}), \dots, \bar u(y|_{S_{i-1}})$ for u = a (this is the most important
part; we denoted it by p), (ii) the bit $r_i$ chosen so that (3) is true, (iii) a
rational $\gamma$ and an approximation to a rational $\bar g$, and (iv) Sipser's code p''
that distinguishes $\bar u$ among all "candidates". The Arthur-Merlin protocol
works as follows. Arthur chooses at random strings r(1), ..., r(s). Merlin
provides $s \cdot (\bar g - \gamma)$ certificates corresponding to these r(j). Arthur computes
the list of $z'_1, \dots, z'_s$ (strings with many approximations) corresponding to the
obtained certificates. Then Arthur selects $\bar u$ from the list of $z'_j$ using p'', and
computes $a = \mathrm{LDC}^{-1}_{n,\delta}(\bar u)$.

If Merlin is fair, this plan works with probability at least 2/3 (Lemma 3).
If Merlin wants to cheat, he has two options: provide a list of certificates such
that the required string $\bar u$ is not in the list of $z'_1, \dots, z'_s$, or such that at least
one of $z'_j$ is not a candidate (in these cases Arthur fails to select $\bar u$ using
p''). However, from Lemma 3 it follows that for random $z'_1, \dots, z'_s$ both these
options are closed with probability at least 2/3. □